Abstract

In  this  paper,  we  study  a  query  language  about  databases  evolving  in  (infinite) 
time.  The  syntax  of  the  query  language  is  based  on  a  predicate  temporal  logic.  The 
semantics  of  the  language  is  defined  with  infinite  sequences  of  database  states,  which 
in  this  paper  are  determined  either  by  pure  Datalog  programs  or  by  negated  Datalog 
programs  with  inflationary  semantics.  In  general,  other  mechanisms  for  defining  the 
semantics,  such  as  production  systems,  can  be  used.  We  analyze  the  relative  expressive 
power  of  such  a  query  language  and  the  standard  Datalog  queries  for  both  pure  and 
negated  Datalog  programs.  We  show  that  our  query  language  has  more  expressive 
power  than  Datalog  queries  for  both  pure  and  negated  Datalog  programs  in  general. 
However,  we  also  prove  a  surprising  technical  result  that  the  existential  fragment  of 
temporal  logic  has  the  same  expressive  power  as  Datalog  queries  for  negated  Datalog
programs  with  inflationary  semantics.  This  result  implies  the  collapse  of  the  existential 
fragment  of  temporal  logic  for  such  programs:  any  temporal  logic  formula  from  that 
fragment  can  be  reduced  to  an  equivalent  formula  with  a  single  possibility  operator. 


1      Introduction 

Consider a Datalog program P, adapted from [Ull88, p. 103], that computes the cousin
relationship:

cousin(X,Y) :- parent(X,Xp) & parent(Y,Yp) & parent(Xp,Z) & parent(Yp,Z) & Xp ≠ Yp.
cousin(X,Y) :- parent(X,Xp) & parent(Y,Yp) & cousin(Xp,Yp).

We would like to compute the relation closerCousins, defined by closerCousins(x,y,u,v)
if and only if cousin(x,y) and cousin(u,v) and x and y are "closer related" than u and v
in the obvious sense.

closerCousins can be computed with a temporal logic [Kro87,McArt76,ReUr71] formula
cousin(x,y) before cousin(u,v), assuming appropriate temporal semantics. In this
paper, we formally define both the syntax and the semantics of temporal logic queries.

To define the semantics of temporal logic queries, we first have to introduce a new Datalog
semantics. Traditionally, there is no notion of time associated with Datalog programs:
all the cousins belong to the same "snapshot" of the cousin relation. Therefore, Datalog
semantics is associated with the state of the database at the "fixpoint time." We call
this semantics static because the intermediate states of the database are of no semantic
interest, and only the (static) state at the fixpoint time is meaningful.

In this paper, we adapt Datalog to describe the evolution of a database by introducing
time. Specifically, we assume that time instances are described by natural numbers, with
the present being time instance 0.¹ We can now define the evolution of a database in
time. Given D_i, the state of a database at some time instance i ≥ 0, the state D_{i+1}
of the database at time instance i + 1 is obtained by applying all the program rules
simultaneously to D_i. This process results in an (infinite) sequence of database states
D_0, D_1, D_2, .... Clearly, there is no need to assume the existence of a fixpoint. Since we
assign meanings to the elements of this sequence, we refer to the interpretation of a Datalog
program with this sequence as dynamic semantics. We will use the term dynamic Datalog
to refer to Datalog with dynamic semantics.
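
As an illustration only, the dynamic semantics of the cousin program can be sketched in Python. The function names `step` and `states`, the set-of-pairs encoding, and the small family in `parent` are assumptions made for this sketch and do not come from the paper. Each call to `step` applies both rules simultaneously to the current state, so the list returned by `states` is a prefix of the sequence D_0, D_1, D_2, ....

```python
def step(parent, cousin):
    """Apply both cousin rules simultaneously to the current state."""
    derived = set()
    for (x, xp) in parent:
        for (y, yp) in parent:
            # Rule 1: the parents xp and yp are distinct and share a parent z.
            if xp != yp and any((xp, z) in parent and (yp, z) in parent
                                for (_, z) in parent):
                derived.add((x, y))
            # Rule 2: the parents are already known to be cousins.
            if (xp, yp) in cousin:
                derived.add((x, y))
    return derived


def states(parent, n):
    """Return D_0, ..., D_n for the IDB predicate cousin (D_0 is empty)."""
    seq = [set()]
    for _ in range(n):
        seq.append(step(parent, seq[-1]))
    return seq


# Hypothetical family: parent(child, parent).
parent = {("p1", "g"), ("p2", "g"), ("c1", "p1"),
          ("c2", "p2"), ("d1", "c1"), ("d2", "c2")}
seq = states(parent, 3)
```

In this hypothetical family the pair (c1, c2) enters cousin one step before (d1, d2), which is the kind of timing information that a static, fixpoint-only reading discards.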

As shown in [KeTu89], dynamic semantics of databases can be described with various
formalisms, some of them unrelated to Datalog. In this paper we concentrate on
the following two types of Datalog programs. The first is pure Datalog, without
negations. The second is negated Datalog, Datalog¬ (negations being allowed only
in the bodies of rules), with inflationary semantics [AbVi88,GuSh86,KoPa88]. We chose
inflationary semantics because it can be applied to arbitrary negated Datalog programs.

Once semantics for dynamic databases is defined, the next step is to define a query
language about future time instances of dynamic databases. One of the attractive features
of Datalog is the unity of the data model and the query language. Since we selected
Datalog to describe dynamic databases, we would like to use the standard query language
on Datalog, i.e. Datalog queries that simply constitute intensional database predicates
(IDB predicates) taken at "fixpoint time." Although Datalog queries determine the state
of a database at the fixpoint time, one can still attempt to use them to provide answers
about intermediate database states as well. This might be achievable by expanding the
underlying Datalog program so that it still "remembers" important intermediate states
at the fixpoint time. Alternatively, we can choose a future-related fragment of predicate
temporal logic [Kro87,McArt76,ReUr71] as a query language for dynamic Datalog and its
extensions.

¹Of course, time can be "shifted" to describe a finite past too.

In this paper, we study the relationship between Datalog queries and temporal logic
queries. Clearly, any Datalog query Q(x), where x is the vector of free variables in Q,
can be expressed in a temporal logic for a Datalog or Datalog¬ program P as ◇Q(x), where
◇ is the possibility operator of temporal logic [Kro87]. Intuitively, Q(x) is true at the
"fixpoint time" if at some point in the execution of program P, Q(x) becomes true (this
latter statement is denoted by ◇Q(x)). Thus, temporal logic is at least as powerful as
Datalog queries. Indeed, as we show, temporal logic queries have more expressive power
than Datalog queries for both Datalog and Datalog¬ programs with inflationary semantics.

It is technically challenging to analyze when Datalog queries are as powerful as temporal
logic queries. We will show that, under mild restrictions, the existential fragment
of temporal logic (to be defined later) has the same expressive power as Datalog queries
for safe Datalog¬ with inflationary semantics. Among other things, it follows from this
that under some mild restrictions the existential fragment of temporal logic for Datalog¬
with inflationary semantics collapses to a single possibility operator, i.e. any temporal logic
formula from the existential fragment can be expressed with a single possibility operator
◇.

It may be worthwhile to state succinctly the advantages of temporal logic queries over
Datalog queries for dynamic semantics defined by Datalog programs and their extensions.
First, as claimed above, they have more expressive power for Datalog and Datalog¬ with
inflationary semantics. Second, they do not depend on the existence of fixpoints for some
Datalog extensions, e.g. for Datalog¬¬ (the extension of Datalog¬ where negations are
allowed both in the body and the head of a rule) [AbViSO].² Third, we believe that
temporal logic queries are conceptually simpler. For the specific case of Datalog¬ with
inflationary semantics, any Datalog query Q can be expressed as ◇Q in temporal logic;
however, as we show later, it seems that some of the temporal logic queries expressible as
Datalog queries can be expressed so only in a very complicated way.

We now describe the organization of the rest of the paper while highlighting our
contributions. In Section 2, we define some preliminary concepts that will be used
in Section 4, and we define a query language about dynamic databases whose syntax is based
on temporal logic and whose semantics is based on dynamic Datalog and its extensions.
In Section 3, we describe related work. In Section 4, we analyze the relative expressive
powers of temporal logic and Datalog queries for Datalog and Datalog¬ programs with
dynamic semantics. We show how domain-independent³ and domain-dependent queries
from the existential fragment of temporal logic can be expressed with Datalog queries. Our
technique is based on expressing temporal queries with ordering predicates that we introduce in
the paper. These predicates characterize the relative times when various tuples are inserted
into the IDB predicates and, most importantly, can be computed using safe Datalog¬ rules.

2      Preliminaries 

In this section, we define preliminary concepts that are necessary for understanding the
material presented in Section 4. However, certain concepts, mainly in Section 2.2,
constitute our original contribution. We will delineate these concepts from the work of other
researchers.

2.1      Datalog  and  Its  Extensions 

A pure Datalog program is a finite set of Horn clauses, i.e. rules of the form
A ← B_1 ∧ ... ∧ B_n, where A is a positive literal and B_1, ..., B_n are positive literals
or built-in predicates x_i = x_j, x_i ≠ x_j, x_i > x_j, etc. (the x_i's are variables or
constants). A negated Datalog program, or Datalog¬ program, is a set of rules in which
negations are allowed in the body of a rule. A Datalog program P has two types of
predicates. Extensional database predicates (EDB predicates) never appear in the head of
a rule in P. For each EDB predicate, P has a set of ground atoms (facts) associated with
that predicate. The predicates appearing in the left-hand side of some rule in P are called
intensional database predicates (IDB predicates). For a general survey of Datalog, see
e.g. [Ull88].

²Actually, as stated above, temporal logic queries can be applied to any formalism able to
express dynamic semantics.

³This concept corresponds to the familiar one for static databases and will be defined precisely later.

If P is a Datalog or Datalog¬ program, the schema of P, sch(P), is the database
schema consisting of schemas of all the relations occurring in P. A domain of program
P, DOM_P, is the domain associated with schema sch(P). The domains can be infinite in
general.

The semantics of Datalog programs is traditionally defined in terms of the least
fixpoint, whereas there are several types of semantics proposed in the literature for Datalog¬.
In this paper, we will use inflationary semantics [AbVi88,GuSh86,KoPa88] for Datalog¬
because it is applicable to all Datalog¬ programs. Both types of semantics are defined
with a mapping EVAL from EDB and IDB predicates to IDB predicates of a program P,
which, for a given instance of IDB predicates D, computes all the new facts derivable from
D and from EDB predicates by applying rules in P.⁴ IDB predicates are initially made
empty. The least fixpoint semantics is defined as the fixpoint of the recurrence equation
D_{i+1} = EVAL(D_i), and inflationary semantics with the fixpoint of the recurrence equation
D_{i+1} = D_i ∪ EVAL(D_i) (a fixpoint is reached when D_i = D_{i+1}). These two semantics
coincide on pure Datalog programs. We call these types of semantics, based on fixpoints,
static because they are defined only in terms of the "final" (static) result.
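
A minimal sketch of the inflationary recurrence, under stated assumptions: the two-rule program `A(x) :- E(x), not B(x).  B(x) :- E(x).`, the function names `eval_rules` and `inflationary_fixpoint`, and the encoding of facts as (predicate, value) pairs are all hypothetical and chosen only to make the recurrence D_{i+1} = D_i ∪ EVAL(D_i) concrete.

```python
def eval_rules(edb_e, idb):
    """EVAL for the hypothetical program
         A(x) :- E(x), not B(x).
         B(x) :- E(x).
    Returns the facts derivable in one simultaneous rule application."""
    derived = set()
    for x in edb_e:
        if ("B", x) not in idb:
            derived.add(("A", x))
        derived.add(("B", x))
    return derived


def inflationary_fixpoint(edb_e):
    """Iterate D_{i+1} = D_i ∪ EVAL(D_i) from empty IDB predicates
    until D_i = D_{i+1}; return the list of distinct states."""
    state = set()
    trace = [state]
    while True:
        nxt = state | eval_rules(edb_e, state)
        if nxt == state:
            return trace
        trace.append(nxt)
        state = nxt


trace = inflationary_fixpoint({"a"})
```

With the EDB fact E(a), the fact A(a) is derived in the first step and kept afterwards, even though a plain re-application of EVAL to the resulting state no longer derives it. This is the sense in which the semantics is inflationary.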

Another meaning of Datalog and Datalog¬ programs can be associated with the entire
sequence of database states D_0, D_1, D_2, ..., D_i, .... We call this kind of semantics dynamic
because it defines the "dynamics" of Datalog and Datalog¬ programs. This dynamic
semantics will become important below when we define the semantics of temporal logic.

⁴For a precise definition of EVAL see [Ull88, p. 115].

Next, we define safety. Intuitively, a rule is safe if it cannot produce new constants.
Formally, we define limited variables as in [Ull88]. A variable is limited if it either occurs
positively in one of the predicates in the body of a rule or can be equated to a constant
or to some other variable that occurs positively in the body of a rule through a chain of
equalities. A Datalog¬ rule is called safe [AbVi88] if all the variables in the head of the rule
are limited⁵. A Datalog¬ program is safe if all of its rules are safe. Clearly, safe Datalog¬
programs cannot introduce new symbols when rules are applied to a database. Therefore,
a safe Datalog¬ program with inflationary semantics always has a fixpoint [AbVi88]. The
definition of safety presented here (and in [AbVi88]) differs from the conventional definition
of safety from [Ull88]. In [Ull88], a rule is safe if all the variables (not only in the head
but also in the body of a rule) are limited. Therefore, the conventional definition of safety
[Ull88] is more restrictive than the definition from [AbVi88] adopted in this paper.
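
The safety condition adopted here can be sketched as a small check. The rule representation (sets of variable names, a list of equality pairs, and a set of constants) and the function names `limited_variables` and `is_safe` are assumptions of this sketch, not notation from the paper.

```python
def limited_variables(positive_vars, equalities, constants):
    """Variables that are limited: they occur positively in the body, or are
    linked by a chain of equalities to a constant or to such a variable."""
    limited = set(positive_vars)
    changed = True
    while changed:
        changed = False
        for left, right in equalities:
            for u, v in ((left, right), (right, left)):
                anchored = u in limited or u in constants
                if anchored and v not in limited and v not in constants:
                    limited.add(v)
                    changed = True
    return limited


def is_safe(head_vars, positive_vars, equalities=(), constants=()):
    """Safe in the sense used here: every head variable is limited."""
    limited = limited_variables(positive_vars, equalities, set(constants))
    return set(head_vars) <= limited


# A(x) :- B(x), not C(y).   Safe here: only head variables must be limited.
safe_despite_unlimited_body_var = is_safe({"x"}, {"x"})
# A(x, y) :- B(x), not C(y).   Unsafe: head variable y is not limited.
unsafe_head_var = is_safe({"x", "y"}, {"x"})
# A(z) :- B(x), y = x, z = y.   Safe through a chain of equalities.
safe_by_chain = is_safe({"z"}, {"x"}, [("y", "x"), ("z", "y")])
```

The first example shows the difference from the conventional definition: the body variable y is not limited, so the rule would be rejected under the stricter definition, yet it is accepted here because y does not occur in the head.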

If program P is not safe then it generates all the new symbols from DOM_P because a
variable in the head not occurring positively in the body is not bound to any value during
the rule matching process and, therefore, is assigned any value from DOM_P. Consequently,
we get infinite relations for unsafe programs. To avoid this situation, we consider only safe
programs in this paper.

A Datalog query for a Datalog or Datalog¬ program P is a predicate Q appearing
among the IDB predicates of P.

2.2      Temporal  Logic  as  a  Query  Language 

In this section, we define a query language based on temporal logic. The syntax of the query
language is based on a predicate temporal logic and the semantics either on pure Datalog
or on Datalog¬. Separately, temporal logic and Datalog have been extensively studied
before. One of our contributions in this paper lies in combining the two separate concepts
in one integral approach.

We consider the standard predicate temporal logic [McArt76,ReUr71] with standard
temporal operators ◇ (possibility), □ (necessity), ○ (next-moment); binary operators while,
until, unless, before, atnext [Kro87]; and with time defined with natural numbers.
A atnext B is true if A is true at the first time B is true. If B is never true then atnext
yields false⁶. Definitions of other operators can be found in [Kro87]. [Kro87] also shows
how all of them can be expressed in terms of atnext.
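
The reading of atnext given above can be sketched over a finite prefix of truth values. This is an illustration only: the function name `atnext` and the list-of-booleans encoding are assumptions, and the sketch further assumes that the prefix is long enough that neither formula changes its value afterwards (as happens once a fixpoint has been reached).

```python
def atnext(a_trace, b_trace):
    """Value at time 0 of 'A atnext B', where a_trace[t] and b_trace[t] are
    the truth values of A and B at time t. Returns A's value at the first
    time B is true, and False if B is never true within the prefix."""
    for a_now, b_now in zip(a_trace, b_trace):
        if b_now:
            return a_now
    return False


holds = atnext([False, True, True], [False, False, True])   # A true when B first holds
fails = atnext([False, False, True], [False, True, True])   # A still false at that time
never = atnext([True, True], [False, False])                # B never becomes true
```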


⁵Actually, this definition of safety is a rewritten definition of strong safety from [AbVi88].
⁶Our atnext operator differs from the definition of atnext in [Kro87]. Since each of the atnext
operators can be expressed in terms of the other, the slight alteration of meaning does not affect our results.


Semantics of a temporal logic formula is defined with a temporal structure [Kro87],
which comprises the values of all its predicates at all the times in the future. Formally,
a temporal structure is a mapping K : N → 𝒫_1 × ... × 𝒫_n, where N is the set of natural
numbers, and 𝒫_i is the set of all the possible interpretations of a predicate P_i. The mapping
K assigns to each time instance (a natural number) the truth values for predicates P_i. We
will use K_t instead of K(t) to denote the value of the temporal structure K at time t. We
make an important assumption, natural in the database context, that domains of predicates
do not change over time.

From the database perspective, a temporal structure can be viewed as an infinite
sequence of database states, i.e. D_0, D_1, D_2, .... Various methods for defining sequences of
database states have been considered in [KeTu89]. In this paper, we consider Datalog with
dynamic semantics and Datalog¬ with dynamic inflationary semantics as two mechanisms
for defining sequences of database states. Specifically, a Datalog or Datalog¬ program P
defines the temporal structure K^P = D_0, D_1, D_2, ..., where D_i is the state of the database
(i.e. instances of predicates from sch(P)) at time i, such that D_{i+1} = EVAL(D_i) for
i = 0, 1, 2, ..., where EVAL is the mapping discussed in Section 2.1.

Sequences of database states have been studied before in [GiTa86,KeTu89,Via87].
Also, [SeSh87] defines time sequences and operations on them. This research is related to
our work because temporal structures can be defined with these sequences. However, in
this paper, we concentrate on the sequences defined with Datalog and Datalog¬ programs
and use them to define the semantics of temporal logic queries.

A temporal logic formula φ on a Datalog or Datalog¬ program P, with all the predicates
in φ belonging to schema sch(P), defines a query on P. The answer to query φ on P is the
set of tuples {x | K_0^P(φ(x))}, or, alternatively, it is defined with a static (time independent)
predicate

φ_P(x) = K_0^P(φ(x))

where K^P is the temporal structure determined by program P (K_0^P means that K^P is
evaluated at time t = 0). In other words, φ_P specifies the set of tuples x satisfying the
temporal logic formula φ at time 0 with semantics determined by program P. We will
refer to the first-order logic predicate φ_P as being induced by temporal logic formula φ and
program P. Note that we are interested in making predictions now about events in the
future. For a very simple example consider φ : A(x) atnext B(x). Then φ_P(x) is true at
time 0 if A(x) is true at the first time instance when B(x) becomes true.

Denote the class of all temporal logic formulae as TL and the class of all quantifier-free
temporal logic formulae as TL_0. Consider all the TL formulae having the form
(∃x_1) ... (∃x_n) φ(x_1, ..., x_n), where φ(x_1, ..., x_n) is a TL_0 formula, and all the TL
formulae equivalent to them (i.e. producing the same answers for all the temporal structures).
Call this subclass of TL formulae the existential fragment of TL and denote it as TL_∃.

Given a Datalog or Datalog¬ program P, we can ask either Datalog queries or queries
expressed in temporal logic on P. In Section 4, we analyze the expressive powers of the
two approaches, but first, we compare our work with the work of others.

3      Related  Work 

Our  work  is  related  to  research  on  temporal  databases,  work  on  Datalog  and  its  extensions, 
and  to  temporal  logic. 

The work on temporal databases [Ar86,Sno87,Gad88,ClCr87,NaAh88,LoJo88,Tan86]⁷
is concerned with the issues of representing finite sequences of database states with the
actual (materialized) data and querying these representations. In contrast, we are
interested in the mechanisms that generate infinite sequences of database states, in general,
such as Datalog and its extensions, as well as in querying the sequences generated by these
mechanisms.

There is a large body of research studying queries on Datalog and its extensions (we
again refer the reader to [Ull88]). However, as stated before, this research does not
consider temporal aspects of Datalog and is mainly interested in fixpoint queries. The
paper [ChIm88] constitutes an exception to this. It studies evolution of databases in time
by introducing a single monadic function successor per predicate and dividing attributes
into temporal and non-temporal types. That approach to modelling time in the framework
of logic programming⁸ is more general than the dynamic semantics of Datalog. This can
be illustrated with an example: a Datalog rule A(x) ← B(x) can be converted to the rule
B(t,x) → A(t+1,x) in the formalism of [ChIm88]. In addition, the complexity of query
processing and questions related to finiteness of least fixpoints are studied in [ChIm88].
Finally, the issues of computing infinite fixpoints with finite computations by introducing
infinite objects are considered. However, the semantics of programs in [ChIm88] is still
defined in terms of fixpoints and is, therefore, static. In other words, queries are still
asked about the predicates at the fixpoint time and not about "intermediate" stages as
is done in temporal logic. In contrast, we consider Datalog and Datalog¬ programs as
temporal structures for temporal logic queries, and we compare the expressive power of
Datalog queries with that of temporal logic queries for Datalog and Datalog¬ programs.

⁷This list represents the scope of the work in the field and is not meant to be exhaustive.
⁸Since a function symbol is introduced.

There is a large body of work on temporal logic. Textbooks such as [Kro87,ReUr71,vBen83]
provide a general description of this research. However, this research considers
arbitrary temporal structures, whereas we are interested in the temporal structures
generated by finite formalisms such as Datalog and its extensions and in the expressive power
of temporal logic queries on these structures.

The paper [ClWa83] uses intensional logic [Gal75] to provide a formal semantics of
historical databases. It also discusses how time-related queries can be expressed in
intensional logic. It is well known that temporal logics constitute fragments of
intensional logic [Gal75]. However, we use temporal logic and not intensional logic as a query
language, as it is better suited for the relational model and is therefore more applicable
to Datalog programs.

As was stated before, there has been no analysis of the relative expressive powers of
temporal logic and Datalog queries. In the next section, we carry out such an analysis. In
particular, we prove a surprising result that for negated Datalog with inflationary semantics,
Datalog queries have the same expressive power as the domain independent (to be defined
later) existential fragment of temporal logic.

4      Temporal  Logic  vs.  Datalog  Queries 

In this section, we compare the expressive power of temporal logic and Datalog queries
for two groups of programs. The first group consists of pure Datalog programs, and
the second group consists of Datalog¬ programs with the inflationary semantics (we will
implicitly assume that semantics is inflationary for Datalog¬ programs in the sequel). First,
we show that any Datalog query can be expressed in temporal logic with a simple formula
for both types of programs.

Proposition 1 For any Datalog query Q defined on a Datalog¬ program P there is a temporal
logic query defined on P such that the two formulae define the same mapping.

Proof: The temporal logic formula is simply {x | ◇Q(x)}. A tuple x belongs to the
fixpoint of P if and only if at some point in time Q(x) is true. ∎
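
The argument of Proposition 1 can be checked on a small trace. The sketch assumes a hypothetical, hand-written trace of states in which facts are never removed (as under inflationary semantics) and whose last state is the fixpoint; the names `possibly` and `in_fixpoint` are illustrative only.

```python
def possibly(trace, fact):
    """Possibility operator at time 0: the fact holds in some state."""
    return any(fact in state for state in trace)


def in_fixpoint(trace, fact):
    """Datalog query answer: the fact holds in the final (fixpoint) state."""
    return fact in trace[-1]


# Hypothetical nondecreasing trace D_0, D_1, D_2 of an IDB predicate Q.
trace = [set(), {("Q", "a")}, {("Q", "a"), ("Q", "b")}]
candidates = [("Q", "a"), ("Q", "b"), ("Q", "c")]
agree = all(possibly(trace, f) == in_fixpoint(trace, f) for f in candidates)
```

The two functions agree here because every state is contained in the last one; on a trace where facts could disappear, the equivalence would not be expected to hold.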

Since Datalog programs constitute a subset of Datalog¬ programs, Proposition 1 holds
for Datalog programs as well.

It is simple to express Datalog queries in temporal logic. However, the question
whether or not temporal logic queries can be expressed as Datalog queries is a non-trivial
one. We determine the answer to this question in the rest of this section. Subsection 4.1
deals with pure Datalog and Subsection 4.2 with Datalog¬.

4.1  Temporal  Logic  vs.  Datalog  Queries  for  Datalog 

Theorem 2 TL_0 has more expressive power than Datalog queries for Datalog programs.

Proof: We claim that the query defined by the TL_0 formula A(x) atnext B(x), where
A and B are EDBs, is not expressible in Datalog. The claim follows from the fact that
this query is not monotone in predicate B in general. In contrast, Datalog programs are
monotone in all their predicates. ∎

4.2  Temporal  Logic  vs.  Datalog  Queries  for  Negated  Datalog  with  Inflationary  Semantics

In this section, we prove the main technical result of this paper: the existential
fragment TL_∃ of temporal logic, with mild restrictions imposed on it, has the same expressive
power as Datalog queries for safe negated Datalog programs with inflationary semantics.
The restrictions imposed on TL_∃ have the following nature. The answer to a Datalog query
on a safe Datalog¬ program constitutes a finite relation because safe rules cannot
produce new symbols. However, an arbitrary query from TL_∃ can produce an infinite answer.
Therefore, we define the concept of domain independence of temporal logic queries as a
generalization of domain independent relational queries [Ull88] and restrict TL_∃ to domain
independent queries to guarantee finite answers. Formally, we will prove that for any safe
Datalog¬ program P and a domain independent query φ on P from TL_∃, there is a safe
Datalog¬ program P' and a Datalog query Q such that φ_P = Q_{P'}.

We also prove another, stronger result: we do not restrict queries from TL_∃ to be
domain independent and treat infinite answers in some "uniform" fashion by introducing
a special constant u outside DOM_P such that (loosely speaking) the value of a temporal
logic query on infinitely many tuples coincides with the value of the query on u. The exact
meaning of this statement will be provided later in this paper.

We structure the proof of the main result of this section as follows. First, we construct
ordering predicates for a temporal logic query φ coding the order in which various tuples
are added to the IDB predicates. Second, we show how these predicates can be used to
produce a Datalog query "equivalent" to φ. Third, we show how these predicates can be
produced with safe Datalog¬ programs. In the next subsection, we provide an example
that illustrates these steps.

4.2.1      Example 

Consider a simple temporal formula φ(x,y,z): A(x,y) atnext B(y,z). First, we define
the predicates coding the order in which tuples are inserted into the IDB predicates.

Let t_A(x,y) be the time instance when the tuple (x,y) is inserted into the predicate
A, and let t_B(y,z) be the time instance when the tuple (y,z) is inserted into the predicate
B. t_A(x,y) and t_B(y,z) are well defined, because under the inflationary semantics of
Datalog¬, once a tuple is inserted into a predicate, it will never be removed from it. In
general, 0 ≤ t_A(x,y), t_B(y,z) ≤ ∞. t_A(x,y) = 0 means that (x,y) was in A at time 0 (and
therefore A was an EDB), and t_A(x,y) = ∞ means that (x,y) was never inserted into A;
similarly for t_B.
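
A small sketch of how such insertion times could be read off a finite trace of states. The names `insertion_times` and `t`, the encoding of facts as (predicate, tuple) pairs, and the sample trace are assumptions of this sketch; the trace is taken to be nondecreasing and to end at the fixpoint, so a fact absent from every state is assigned ∞.

```python
import math


def insertion_times(trace):
    """Map each fact to the first time instance at which it appears."""
    times = {}
    for i, state in enumerate(trace):
        for fact in state:
            times.setdefault(fact, i)  # keep the earliest index only
    return times


def t(times, fact):
    """Insertion time of a fact, or infinity if it is never inserted."""
    return times.get(fact, math.inf)


# Hypothetical trace: A(1,2) is present at time 0, B(2,3) arrives at time 1.
trace = [{("A", (1, 2))}, {("A", (1, 2)), ("B", (2, 3))}]
times = insertion_times(trace)
```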

It follows from the definition of φ_P, the predicate induced by φ, and from the definition
of the atnext operator that

φ_P(x,y,z) = { True   if t_A(x,y) ≤ t_B(y,z) < ∞
             { False  otherwise

Note that by the definition of atnext, if t_B(y,z) = ∞ then φ_P(x,y,z) = False.
Furthermore, observe that the value of φ_P(x,y,z) depends only on the relative times when
A(x,y) and B(y,z) become true, that is, when (x,y) is inserted into A and (y,z) is inserted
into B.

To formally code the relative order of these "insertion" times, we introduce 11 ordering
predicates: R_{0=A=B<∞}, R_{0=A<B<∞}, R_{0=A<B=∞}, R_{0<A=B<∞}, R_{0<A<B<∞}, R_{0<A<B=∞},
R_{0<A=B=∞}, R_{0=B<A<∞}, R_{0=B<A=∞}, R_{0<B<A<∞}, R_{0<B<A=∞}. These predicates cover all
possibilities of relative times of tuple insertions into predicates A and B.

The notational structure of these predicates is R_{0 θ_0 P_1 θ_1 P_2 θ_2 ∞}, where
θ_0, θ_1, θ_2 ∈ {=, <} and {P_1, P_2} = {A, B}. Such a predicate is defined by

R_{0 θ_0 P_1 θ_1 P_2 θ_2 ∞} = { True   if 0 θ_0 t_{P_1} θ_1 t_{P_2} θ_2 ∞
                              { False  otherwise

For example, R_{0<A=B<∞}(x,y,z) is true if and only if (x,y) was not in A and (y,z)
was not in B at time 0, and (x,y) was inserted into A and (y,z) was inserted into B at
the same time instance.

Next, we show how the ordering predicates can be used to define a Datalog query
equivalent to φ. As φ_P(x,y,z) is true if and only if t_A(x,y) ≤ t_B(y,z) < ∞, it follows
that φ_P is equivalent to R_{0=A=B<∞} ∨ R_{0=A<B<∞} ∨ R_{0<A=B<∞} ∨ R_{0<A<B<∞}.⁹
Clearly, if we introduce a new predicate Q such that Q ← R_i for all the four predicates
R_i appearing in the previous disjunction, then Q and φ_P are equivalent.

In the proof of the main theorem we will show how such ordering predicates in the
disjunction can be computed with safe Datalog¬ programs. This means that φ is equivalent
to Q on a safe Datalog¬ program.
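
To make the case analysis of this example concrete, the following sketch classifies a pair of insertion times into the unique ordering predicate that holds for it and compares the result with the condition t_A ≤ t_B < ∞. The function names, the string encoding of the subscripts (with `inf` standing for ∞), and the sample times are assumptions of the sketch.

```python
import math


def ordering_predicate(t_a, t_b):
    """Subscript of the unique ordering predicate that holds for the
    insertion times t_a and t_b (natural numbers or math.inf)."""
    first, second = ("A", "B") if t_a <= t_b else ("B", "A")
    low, high = min(t_a, t_b), max(t_a, t_b)
    theta0 = "=" if low == 0 else "<"
    theta1 = "=" if low == high else "<"
    theta2 = "=" if high == math.inf else "<"
    return f"0{theta0}{first}{theta1}{second}{theta2}inf"


def atnext_holds(t_a, t_b):
    """Value of A atnext B at time 0 in terms of insertion times."""
    return t_a <= t_b < math.inf


sample_times = [0, 1, 2, math.inf]
all_labels = {ordering_predicate(a, b) for a in sample_times for b in sample_times}
true_labels = {ordering_predicate(a, b)
               for a in sample_times for b in sample_times if atnext_holds(a, b)}
```

On these sample times `all_labels` contains one label for each possible relative order of the two insertion times, and `true_labels` collects exactly those orders under which the atnext formula holds at time 0.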

4.2.2      Ordering  Predicates  for  Quantifier-free  Formulae 

In this section, we restrict our attention only to quantifier-free temporal logic formulae
TL_0.

We now formally define ordering predicates on a query $\phi$. A temporal logic formula can have several references to the same predicate. Each such reference will be called an occurrence of a predicate in a formula. Two occurrences of the same predicate are identical if they have the same list of variables, e.g. $P(x_1, \ldots, x_k)$; otherwise, they are distinct. Let $A_1, \ldots, A_n$ be all the distinct occurrences of all the predicates from query $\phi$, and let $(i_1, \ldots, i_n)$ be a permutation of $(1, \ldots, n)$.

[Footnote from Section 4.2.1: In this example it does not matter whether $t_A(x,y) = 0$ or $t_A(x,y) > 0$, but in general such distinctions need to be made.]

For example, the formula $B(x)$ atnext $(B(y) \land C(x,x))$ gives 3 predicates, which we could write as $A_1(x)$, $A_2(y)$, $A_3(x,x)$.

Let $\mathbf{x}$ be a sequence of some length $m$ listing in some order all the variables of $\phi$. For each predicate $A_i$, $\mathbf{x}_i$ will denote the actual variables of $A_i$ in the order they appear. Thus in the above example, $m = 2$, $\mathbf{x} = (x,y)$, $\mathbf{x}_1 = (x)$, $\mathbf{x}_2 = (y)$, $\mathbf{x}_3 = (x,x)$. As defined in Section 4.2.1, $t_{A_i}(\mathbf{x}_i)$ will denote the time instance when $\mathbf{x}_i$ is inserted into $A_i$. As before, $t_{A_i}(\mathbf{x}_i) = \infty$ means that $\mathbf{x}_i$ is not inserted into $A_i$ at all.

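For each permutation $(i_1, i_2, \ldots, i_n)$ of $(1, 2, \ldots, n)$, we define a set of ordering predicates $R_{0\,\theta_{i_0} A_{i_1} \theta_{i_1} A_{i_2} \theta_{i_2} \cdots A_{i_n} \theta_{i_n}\,\infty}(\mathbf{x})$, where $\theta_{i_0}, \theta_{i_1}, \ldots, \theta_{i_n}$ vary over $\{=, <\}$, and at least one of them is $<$. Generalizing from Section 4.2.1, $R_{0\,\theta_{i_0} A_{i_1} \theta_{i_1} A_{i_2} \theta_{i_2} \cdots A_{i_n} \theta_{i_n}\,\infty}(\mathbf{x})$ is true if and only if

$$0 \;\theta_{i_0}\; t_{A_{i_1}}(\mathbf{x}_{i_1}) \;\theta_{i_1}\; t_{A_{i_2}}(\mathbf{x}_{i_2}) \;\theta_{i_2}\; \cdots \; t_{A_{i_n}}(\mathbf{x}_{i_n}) \;\theta_{i_n}\; \infty.$$

To avoid cumbersome notation, we denote the set of all the ordering predicates by $R = \{R_i \mid i \in I\}$ for an appropriate index set $I$.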

Next,  we  state  a  technical  lemma  needed  to  prove  Lemma  4. 

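Lemma 3 Let $\alpha(\mathbf{x}) = 0 \;\theta_{i_0}\; t_{A_{i_1}}(\mathbf{x}_{i_1}) \;\theta_{i_1}\; t_{A_{i_2}}(\mathbf{x}_{i_2}) \;\theta_{i_2}\; \cdots \; t_{A_{i_n}}(\mathbf{x}_{i_n}) \;\theta_{i_n}\; \infty$. Define the intervals of natural numbers $I_0, I_1, \ldots, I_n$ by

$$I_j(\mathbf{x}) = \begin{cases} [0,\; t_{A_{i_1}}(\mathbf{x}_{i_1})) & \text{if } j = 0; \\ [t_{A_{i_j}}(\mathbf{x}_{i_j}),\; t_{A_{i_{j+1}}}(\mathbf{x}_{i_{j+1}})) & \text{if } j = 1, 2, \ldots, n-1; \\ [t_{A_{i_n}}(\mathbf{x}_{i_n}),\; \infty) & \text{if } j = n. \end{cases}$$

Let $T(\phi(\mathbf{x}))$ be the set of time instances in which $\phi(\mathbf{x})$ is true. Then there is a subset of indices $j_1, j_2, \ldots, j_k$ such that $T(\phi(\mathbf{x}))$ is the union of non-empty intervals $I_{j_1}(\mathbf{x}), \ldots, I_{j_k}(\mathbf{x})$, for all $\mathbf{x}$ satisfying $\alpha(\mathbf{x})$. This means that the indices $j_1, \ldots, j_k$ do not depend on $\mathbf{x}$.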

Proof: Without loss of generality, assume that $i_j = j$ for each $j$, and define $t_0 = 0$, $t_{n+1} = \infty$. Note that some of the intervals may be empty and they are all disjoint. We prove the lemma by induction on the number of operators in $\phi$, which without loss of generality are $\neg$, $\lor$, atnext. Recall that we only consider $\mathbf{x}$ satisfying $\alpha(\mathbf{x})$.

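1. $\phi$ is $A_i$ for some $i$. Then $T(\phi(\mathbf{x})) = \bigcup \{ I_j(\mathbf{x}) \mid j \ge i \land I_j(\mathbf{x}) \neq \emptyset \}$. The choice of intervals clearly does not depend on $\mathbf{x}$ (as long as $\alpha(\mathbf{x})$ holds).

2. $\phi$ is $\neg\phi_1$. By induction, $T(\phi_1(\mathbf{x})) = \bigcup \{ I_j(\mathbf{x}) \mid j \in J_1 \}$, where $J_1$ does not depend on $\mathbf{x}$. Then $T(\phi(\mathbf{x})) = \bigcup \{ I_j(\mathbf{x}) \mid j \notin J_1 \land I_j(\mathbf{x}) \neq \emptyset \}$. Note that the set of intervals does not depend on $\mathbf{x}$.

3. $\phi$ is $\phi_1 \lor \phi_2$. By induction, $T(\phi_1(\mathbf{x})) = \bigcup \{ I_j(\mathbf{x}) \mid j \in J_1 \}$ and $T(\phi_2(\mathbf{x})) = \bigcup \{ I_j(\mathbf{x}) \mid j \in J_2 \}$, where $J_1$, $J_2$ do not depend on $\mathbf{x}$. Then $T(\phi(\mathbf{x})) = \bigcup \{ I_j(\mathbf{x}) \mid j \in J_1 \cup J_2 \}$. Clearly, the set of intervals over which $\phi(\mathbf{x})$ is true does not depend on $\mathbf{x}$.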

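4. $\phi$ is $(\phi_1 \text{ atnext } \phi_2)$. Let $J'(j) = \{ j' \mid (j' > j) \land (I_{j'}(\mathbf{x}) \neq \emptyset) \land (I_{j'}(\mathbf{x}) \subseteq T(\phi_2(\mathbf{x}))) \}$ and let

$$h(j) = \begin{cases} \min(J'(j)) & \text{if } J'(j) \neq \emptyset \\ n + 1 & \text{otherwise} \end{cases}$$

Both $h$ and $J'$ are well-defined because, by induction, the set of intervals comprising $T(\phi_2(\mathbf{x}))$ does not depend on $\mathbf{x}$ and because whether or not $I_{j'}(\mathbf{x}) \neq \emptyset$ also does not depend on $\mathbf{x}$.

Then,

$$T(\phi(\mathbf{x})) = \bigcup \{ I_j \mid (I_{h(j)} \subseteq T(\phi_1(\mathbf{x}))) \land (h(j) \le n) \}$$

By induction, the choice of intervals in $\phi_1$ does not depend on $\mathbf{x}$. Therefore, the set of intervals comprising $T(\phi(\mathbf{x}))$ also does not depend on $\mathbf{x}$.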
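The first three cases of the induction can be illustrated with index sets over the intervals $I_0, \ldots, I_n$. The sketch below is an illustration only: the helper names are hypothetical, the insertion times are assumed to be already sorted (the "without loss of generality" step of the proof), and the atnext case is left out because its semantics is defined outside this section.

```python
import math

INF = math.inf

def intervals(sorted_times):
    """Half-open intervals I_0..I_n induced by insertion times t_1 <= ... <= t_n."""
    bounds = [0] + list(sorted_times) + [INF]
    return [(bounds[j], bounds[j + 1]) for j in range(len(bounds) - 1)]

def nonempty(iv):
    return iv[0] < iv[1]

def atom(i, ivs):
    """Index set for the atom A_i (1-based): non-empty intervals I_j with j >= i."""
    return {j for j in range(i, len(ivs)) if nonempty(ivs[j])}

def neg(index_set, ivs):
    """Index set for a negation: the non-empty intervals not in index_set."""
    return {j for j in range(len(ivs)) if nonempty(ivs[j]) and j not in index_set}

def disj(index_set_1, index_set_2):
    """Index set for a disjunction."""
    return index_set_1 | index_set_2

# Two assignments with the same pattern 0 = t1 < t2 = t3 < t4 = oo.
ivs_a = intervals([0, 3, 3, INF])
ivs_b = intervals([0, 7, 7, INF])
print(atom(2, ivs_a), atom(2, ivs_b))
print(neg(atom(2, ivs_a), ivs_a))
```

Both assignments satisfy the same ordering pattern, so the index sets coincide even though the interval endpoints differ, which is the content of the lemma for these cases.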


Lemma 4 Let $\phi$ be a TLq formula. Then for each $i$ either $(\forall \mathbf{x})(R_i(\mathbf{x}) \Rightarrow \phi^*_P(\mathbf{x}))$ or $(\forall \mathbf{x})(R_i(\mathbf{x}) \Rightarrow \neg\phi^*_P(\mathbf{x}))$.

Proof: Since $R_i(\mathbf{x})$ holds if and only if the corresponding $\alpha_i(\mathbf{x})$ holds, based on Lemma 3, all we have to do is to check if $I_0(\mathbf{x}) \subseteq T(\phi(\mathbf{x}))$. ∎

The above lemma partitions the index set $I$ into two subsets $I_{True}$ and $I_{False}$, where $I_{True} = \{ i \mid (\forall \mathbf{x})(R_i(\mathbf{x}) \Rightarrow \phi^*_P(\mathbf{x})) \}$ and $I_{False} = \{ i \mid (\forall \mathbf{x})(R_i(\mathbf{x}) \Rightarrow \neg\phi^*_P(\mathbf{x})) \}$. The next lemma says that the answer to a query from TLq is determined by some set of ordering predicates.

Lemma 5 For any TLq formula $\phi$ and for all $\mathbf{x}$, $\bigvee_{i \in I_{True}} R_i(\mathbf{x}) = \phi^*_P(\mathbf{x})$.

Proof: One part of the lemma follows from Lemma 4. To prove the other part, consider any value of $\mathbf{x}$ for which $\phi^*_P(\mathbf{x})$ is true. Since the $R_i$'s are collectively exhaustive, $R_i(\mathbf{x})$ will be true for some $i \in I$. In addition, it is easy to see that this $i$ is in $I_{True}$. ∎

Note that the set $I_{True}$ is defined only by the formula $\phi$ and is independent of the program $P$ or the values of the EDBs.

Based on the above, the computation of $\phi^*_P$ can be reduced to the computation of the individual predicates $R_i$. We will examine under which conditions they can be computed with safe Datalog¬ programs.

We define the set of base ordering predicates for the predicate instances $A_1, A_2, \ldots, A_n$ from $\phi$. For any $i, j = 1, 2, \ldots, n$, let $\mathbf{x}_{i,j}$ denote some sequence listing the union of the variables in $\mathbf{x}_i$ and $\mathbf{x}_j$. Then the base ordering predicates are divided into four types, each of them being defined as follows.

1. $S_{0=A_i}(\mathbf{x}_i)$ for $i = 1, 2, \ldots, n$. $S_{0=A_i}(\mathbf{x}_i)$ is true if and only if $0 = t_{A_i}(\mathbf{x}_i)$.

2. $S_{0<A_i=A_j<\infty}(\mathbf{x}_{i,j})$ for $i, j = 1, 2, \ldots, n$. $S_{0<A_i=A_j<\infty}(\mathbf{x}_{i,j})$ is true if and only if $0 < t_{A_i}(\mathbf{x}_i) = t_{A_j}(\mathbf{x}_j) < \infty$.

3. $S_{0<A_i<A_j<\infty}(\mathbf{x}_{i,j})$ for $i, j = 1, 2, \ldots, n$. $S_{0<A_i<A_j<\infty}(\mathbf{x}_{i,j})$ is true if and only if $0 < t_{A_i}(\mathbf{x}_i) < t_{A_j}(\mathbf{x}_j) < \infty$.

4. $S_{A_i=\infty}(\mathbf{x}_i)$ for $i = 1, 2, \ldots, n$. $S_{A_i=\infty}(\mathbf{x}_i)$ is true if and only if $t_{A_i}(\mathbf{x}_i) = \infty$.

Predicates of types 1, 2, and 3 will be called bounded; predicates of type 4 will be called unbounded.

Note that, for example, $R_{0=A_4<A_3=A_1<A_2=A_5=\infty} = S_{0=A_4} \land S_{0<A_3=A_1<\infty} \land S_{A_2=\infty} \land S_{A_5=\infty}$. Generally,

Lemma 6 Let $\{S_j \mid j \in J\}$ be some enumeration of the base ordering predicates. Each ordering predicate $R_i$ is equivalent to $\bigwedge_{j \in J_i} S_j$ for some $J_i \subseteq J$.

Proof: Follows from the fact that any $0 \;\theta_{i_0}\; A_{i_1} \;\theta_{i_1}\; A_{i_2} \;\theta_{i_2} \cdots A_{i_n} \;\theta_{i_n}\; \infty$ can be represented as a conjunction of a set of formulae of the form $0 = A_i$, $0 < A_i = A_j < \infty$, $0 < A_i < A_j < \infty$, $A_i = \infty$. ∎
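The decomposition in Lemma 6 can be sketched as a small function. This is an illustration under stated assumptions: the input format (occurrences grouped by insertion time) and the name `decompose` are hypothetical, and the conjuncts chosen here are only one of several equivalent decompositions.

```python
def decompose(at_zero, finite_groups, at_infinity):
    """List base ordering predicates whose conjunction gives one ordering predicate.

    at_zero:       occurrences inserted at time 0
    finite_groups: groups of occurrences inserted together at a finite time > 0,
                   listed in increasing order of that time
    at_infinity:   occurrences that are never inserted
    """
    base = [f"S[0={a}]" for a in at_zero]                      # type 1
    for k, group in enumerate(finite_groups):
        first = group[0]
        # Type 2: equalities inside a group; a singleton group is recorded
        # as first = first, which only states 0 < t < oo.
        base += [f"S[0<{first}={b}<oo]" for b in (group[1:] or [first])]
        # Type 3: strict order between consecutive groups.
        if k + 1 < len(finite_groups):
            base.append(f"S[0<{first}<{finite_groups[k + 1][0]}<oo]")
    base += [f"S[{a}=oo]" for a in at_infinity]                # type 4
    return base

# The example from the text: 0 = A4 < A3 = A1 < A2 = A5 = oo.
print(decompose(["A4"], [["A3", "A1"]], ["A2", "A5"]))
```

For the example ordering predicate this yields the same four conjuncts that the text lists.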

Lemma 7 Bounded base ordering predicates can be computed by safe Datalog¬ rules.

Proof: We consider each of the three types in turn and show for each type how to compute predicates of this type with safe rules.

1. $S_{0=A_i}(\mathbf{x}_i)$ can be true only if $A_i$ is an EDB and $A_i(\mathbf{x}_i)$ holds. For each EDB $A_i$ we write the rule: $S_{0=A_i}(\mathbf{x}_i) \leftarrow A_i(\mathbf{x}_i)$. No rules are written for IDBs.

2. To handle this type we want to state that there is a (finite) instance in time $t > 0$ such that $A_i(\mathbf{x}_i)$ and $A_j(\mathbf{x}_j)$ both become true for the first time (simultaneously). Thus, we want to say that for some $t$, $A_i(\mathbf{x}_i)$ and $A_j(\mathbf{x}_j)$ were false at time $t - 1$ and became true at time $t$. To handle this, for each IDB $A_i$, we introduce a trailing predicate $A'_i$ with the property that if $A_i(\mathbf{x}_i)$ became true at $t$, $A'_i(\mathbf{x}_i)$ becomes true at time $t + 1$. Such IDBs $A'_i$ can be computed by the rules: $A'_i(\mathbf{x}_i) \leftarrow A_i(\mathbf{x}_i)$. Using the trailing predicates, we write the rules: $S_{0<A_i=A_j<\infty}(\mathbf{x}_{i,j}) \leftarrow A_i(\mathbf{x}_i) \land \neg A'_i(\mathbf{x}_i) \land A_j(\mathbf{x}_j) \land \neg A'_j(\mathbf{x}_j)$.

3. To handle this type, we use the trailing predicates defined above and the rules computing them. The derivation is slightly non-intuitive, but the predicates can be computed after adding the rules: $S_{0<A_i<A_j<\infty}(\mathbf{x}_{i,j}) \leftarrow A'_i(\mathbf{x}_i) \land A_j(\mathbf{x}_j) \land \neg A'_j(\mathbf{x}_j)$.
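The rules for types 2 and 3 can be checked against the time-based definitions with a small simulation. The sketch below is an assumption-laden illustration: insertion times are given directly as dictionaries (all positive or infinite, as for IDB tuples), the state at time $t$ is reconstructed from them, and the name `derive` is hypothetical.

```python
import math

INF = math.inf

def derive(times_i, times_j, horizon):
    """Accumulate the heads of the type 2 and type 3 rules over times 0..horizon-1."""
    same, earlier = set(), set()
    for t in range(horizon):
        a_i = {x for x, ti in times_i.items() if ti <= t}      # A_i at time t
        a_i_tr = {x for x, ti in times_i.items() if ti < t}    # trailing A'_i
        a_j = {y for y, tj in times_j.items() if tj <= t}      # A_j at time t
        a_j_tr = {y for y, tj in times_j.items() if tj < t}    # trailing A'_j
        # Type 2 body: A_i and not A'_i and A_j and not A'_j
        same |= {(x, y) for x in a_i - a_i_tr for y in a_j - a_j_tr}
        # Type 3 body: A'_i and A_j and not A'_j
        earlier |= {(x, y) for x in a_i_tr for y in a_j - a_j_tr}
    return same, earlier

times_i = {"a": 1, "b": 2, "c": INF}
times_j = {"u": 2, "v": 1}
same, earlier = derive(times_i, times_j, horizon=4)
print(same, earlier)
```

Because the semantics is inflationary, a head derived at any time stays in the relation, so accumulating over all time points mirrors what the rules compute.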


Note that the unbounded predicates cannot in general be computed with safe Datalog¬ programs because they are in general infinite, as is the case for the formula $\neg(\neg A(x) \text{ atnext } A(x))$, which is true for all $x$ and for all programs. We will provide two solutions to this problem. The first solution is to restrict the consideration of TLq formulae to the class of domain independent formulae, to be defined below, that guarantee finite instances of unbounded predicates. The second approach will be discussed in Section 4.2.4. To define domain independence, we introduce first the notion of the domain of a temporal logic query with respect to a program.

Definition 8 Given a Datalog¬ program $P$ and a temporal logic formula $\phi$, the domain of $\phi$ with respect to $P$, $dom_{P,\phi}$, is the set of all the constants appearing in $\phi$, the constants appearing in all the EDB predicates of $P$, in rules of $P$, and the constants in all the future instances of predicates in $P$, i.e. constants inferred by program $P$.

Since we consider only safe programs, no new constants will be added to $dom_{P,\phi}$ by applying rules from $P$. Therefore, the domain of a safe formula contains only constants in $\phi$, in EDB predicates, and the remaining constants of $P$. The domain of $\phi$ with respect to $P$, $dom_{P,\phi}$, gives rise to a predicate on $DOM_P$ which is true on the elements of $dom_{P,\phi}$ and only on these elements. If no confusion arises, we will use the same notation for that predicate as for the domain itself. Also, $dom_{P,\phi}(x_1, \ldots, x_n)$ will denote $dom_{P,\phi}(x_1) \land \ldots \land dom_{P,\phi}(x_n)$.

Using  Definition  8,  domain  independence    is  defined  as  follows. 

Definition 9 A temporal logic formula $\phi$ is domain independent if for any safe Datalog¬ program $P$ the predicate $\phi^*_P$ is the same for any domain $DOM_P \supseteq dom_{P,\phi}$, i.e. $\phi^*_P$ does not depend on $DOM_P$.

Note that Definition 9 constitutes an extension of the definition of a domain independent formula [Ull88] to temporal logic and Datalog¬ programs. Also note that a domain independent query returns only finite answers, each constant in the answer contained in the domain $dom_{P,\phi}$. Using the notions just defined, we can state the following lemma.

Lemma 10 For each base ordering predicate $S_{A_i=\infty}$, the predicate $S'_{A_i=\infty} = S_{A_i=\infty} \land dom_{P,\phi}$ can be computed with safe Datalog¬ rules.

Proof: The key observation is that a safe Datalog¬ program reaches a fixpoint and it is possible to detect this fixpoint using safe Datalog¬ rules, since $dom_{P,\phi}$ is finite.

Let $FP$ be the flag predicate, which becomes true one time instance after the fixpoint is reached. Then

$$S'_{A_i=\infty}(\mathbf{x}_i) \leftarrow FP \land dom_{P,\phi}(\mathbf{x}_i) \land \neg A_i(\mathbf{x}_i)$$

Clearly, the predicate $dom_{P,\phi}$ can be computed with a Datalog program. To finish the proof, note that $FP$ can be computed using the rule

$$FP \leftarrow \bigwedge_{i=1}^{n} \big( (\forall \mathbf{x}_i)(A_i(\mathbf{x}_i) \Rightarrow A'_i(\mathbf{x}_i)) \big)$$

The rule says that $FP$ becomes true when the predicates in $P$ stop changing over time, i.e. at the fixpoint time. This rule is not normalized; however, the variables $\mathbf{x}_i$ range over the domain $dom_{P,\phi}$, which is finite. Therefore, we can remove universal quantifiers and replace them with finite conjunctions. After that, the rule can be normalized to a set of safe Datalog¬ rules. ∎
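The fixpoint flag and the resulting unbounded predicate can be simulated for a concrete program. The sketch below uses transitive closure as the program, which is a choice made here for illustration. It also adds a one-step guard `started`, an assumption that is not in the text: without it the flag test would succeed vacuously at time 0, when the IDB and its trailing copy are both still empty.

```python
from itertools import product

def never_in_closure(edges):
    """Pairs of active-domain constants that never enter the transitive closure."""
    edges = set(edges)
    dom = {c for e in edges for c in e}           # active domain
    tc, trail = set(), set()                      # IDB tc and its trailing copy tc'
    started, fp, never = False, False, set()      # guard, flag FP, S'_{tc = oo}
    while True:
        # Every rule fires on the current state at once (inflationary step).
        nxt = (
            tc | edges | {(x, z) for (x, y) in tc for (w, z) in edges if y == w},
            trail | tc,                           # tc'(x,y) <- tc(x,y)
            True,                                 # the guard holds from time 1 on
            fp or (started and tc <= trail),      # FP <- every tc tuple is in tc'
            never | ({p for p in product(dom, repeat=2) if p not in tc}
                     if fp else set()),           # S' <- FP & dom & not tc
        )
        if nxt == (tc, trail, started, fp, never):
            return never
        tc, trail, started, fp, never = nxt

print(sorted(never_in_closure({(1, 2), (2, 3)})))
```

The result is finite because the candidate tuples are drawn from the active domain only, which is the role of $dom_{P,\phi}$ in the rule for $S'_{A_i=\infty}$.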

4.2.3      Main  Theorem  for  the  Domain-independent  Case 

In Section 4.2.2, we defined ordering and base ordering predicates and showed how ordering predicates can be used to compute the answer to a query. We also showed how base ordering predicates could be computed with safe Datalog¬ rules. We are now ready to put these results together and to state the preliminary version of the main theorem in the following lemma.

Lemma 11 For any safe Datalog¬ program $P$ over some (possibly infinite) domain $DOM_P$ and any domain independent temporal logic query $\phi$ in TLq, there exists a safe Datalog¬ program $P'$ over $DOM_P$ and a Datalog query $Q$, such that $\phi^*_P$ and $Q_{P'}$ define the same predicate, where $Q_{P'}$ denotes the IDB $Q$ as computed by program $P'$.

Proof: Based on Lemmas 5 and 6, we can write:

$$\phi^*_P = \bigvee_{i \in I_{True}} \bigwedge_{j \in J_i} S_j \qquad (1)$$

Observe that for base ordering predicates of type 1, 2, and 3: $S_j = S_j \land dom_{P,\phi}$, and to simplify subsequent discussion we will write $S'_j$ for $S_j$ for predicates of this type. (For base ordering predicates of type 4, $S'$ was introduced already in the proof of Lemma 10.)

Since $\phi$ is domain independent,

$$\phi^*_P = \phi^*_P \land dom_{P,\phi} = \bigvee_{i \in I_{True}} \bigwedge_{j \in J_i} (S_j \land dom_{P,\phi})$$

which can also be written as

$$\phi^*_P = \bigvee_{i \in I_{True}} \bigwedge_{j \in J_i} S'_j \qquad (2)$$

From Lemmas 7 and 10 we know that all $S'_j$, $j \in J$, can be computed with safe Datalog¬ rules.

The program $P'$ consists of program $P$, the programs that compute predicates $S'_j$ for $j \in J$, and the following set of rules:

$Q(\mathbf{x}) \leftarrow R_i(\mathbf{x})$, for $i \in I_{True}$
$R_i(\mathbf{x}) \leftarrow \bigwedge_{j \in J_i} S'_j(\mathbf{x}'_j)$, for $i \in I_{True}$

where $\mathbf{x}'_j$ denotes the variables of $S'_j$, and $Q$ is an IDB predicate not appearing in other rules. $Q$ constitutes the query in question. It follows from (2) that $Q$ and $\phi^*_P$ define the same predicate. ∎

Now,  we  are  ready  to  state  the  main  theorem  in  the  version  for  domain  independent 
formulae. 

Theorem 12 For any safe Datalog¬ program $P$ over some domain $DOM_P$ and any domain independent temporal logic query $\phi$ in $TL_\exists$, there exists a safe Datalog¬ program $P'$ over $DOM_P$ and a Datalog query $Q$ such that $\phi^*_P$ and $Q_{P'}$ define the same predicate.

Proof: Consider a $TL_\exists$ formula $\phi$. By the definition of $TL_\exists$, $\phi$ can be written as $(\exists \mathbf{x})\phi'(\mathbf{x})$, where $\phi'$ is quantifier-free. By the previous lemma, we can find a program and a query $Q'$ that computes the same result as $\phi'$. Take the projection of $Q'$ on the free variables of $\phi$ to obtain the result. ∎

Corollary 13 Let $P$ be a safe Datalog¬ program over some domain $DOM_P$ and let $\phi$ from $TL_\exists$ be such that all the base ordering predicates appearing in $\phi^*_P$ as described in (1) are bounded. Then there exists a safe Datalog¬ program $P'$ over $DOM_P$ whose rules do not contain constants from the EDBs in $P$ and a Datalog query $Q$, such that $\phi^*_P$ and $Q_{P'}$ define the same predicate.

Proof: Note that we dropped the domain independence requirement because it appeared only in connection with unbounded predicates. The proof follows immediately from the construction of the rules computing the bounded base ordering predicates. ∎

Example 1

We return to the example at the beginning of the introduction. The program $P$ was:

cousin(X,Y) :- parent(X,Xp) & parent(Y,Yp) & parent(Xp,Z) & parent(Yp,Z)
               & Xp ≠ Yp.

cousin(X,Y) :- parent(X,Xp) & parent(Y,Yp) & cousin(Xp,Yp).

We were interested in computing the relation closerCousins(x,y,u,v) defined with the temporal logic formula cousin(x,y) before cousin(u,v). The following Datalog¬ program computes closerCousins in the way outlined in the proof of Theorem 12.

cousin(X,Y) :- parent(X,Xp) & parent(Y,Yp) & parent(Xp,Z) & parent(Yp,Z)
               & Xp ≠ Yp.

cousin(X,Y) :- parent(X,Xp) & parent(Y,Yp) & cousin(Xp,Yp).

cousin'(X,Y) :- cousin(X,Y).

closerCousins(X,Y,U,V) :- cousin'(X,Y) & cousin(U,V) & ¬cousin'(U,V).

Another interesting example is the program computing the closerThan relationship based on the transitive closure relationship transClosure. closerThan(x,y,u,v) is defined as transClosure(x,y) before transClosure(u,v). As before, a Datalog¬ program can be used to compute closerThan, thus computing which pairs of nodes in the graph are closer to each other than other pairs of nodes.
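The behaviour of the closerCousins program under inflationary semantics can be traced with a short simulation. This is a sketch under assumptions: the family facts, the function names `step` and `run`, and the use of `cousin_t` for the trailing predicate cousin' are all introduced here for illustration, and `parent(X, Xp)` is read as "Xp is a parent of X".

```python
def step(parent, cousin, cousin_t, closer):
    """One inflationary step: fire every rule on the current state at once."""
    new_cousin = set(cousin)
    for (x, xp) in parent:
        for (y, yp) in parent:
            # Rule 1: the two parents are distinct and share a parent z.
            if xp != yp and any((xp, z) in parent and (yp, z) in parent
                                for (_, z) in parent):
                new_cousin.add((x, y))
            # Rule 2: the two parents are already known to be cousins.
            if (xp, yp) in cousin:
                new_cousin.add((x, y))
    new_cousin_t = cousin_t | cousin              # cousin'(X,Y) :- cousin(X,Y).
    # closerCousins :- cousin'(X,Y) & cousin(U,V) & not cousin'(U,V).
    new_closer = closer | {
        (x, y, u, v) for (x, y) in cousin_t for (u, v) in cousin - cousin_t
    }
    return new_cousin, new_cousin_t, new_closer

def run(parent):
    """Iterate from empty IDBs until no rule adds a new fact."""
    state = (set(), set(), set())
    while True:
        nxt = step(parent, *state)
        if nxt == state:
            return state
        state = nxt

# Hypothetical family: g has children p1 and p2; a and b are their children;
# a2 and b2 are the children of a and b.
parent = {("p1", "g"), ("p2", "g"), ("a", "p1"), ("b", "p2"),
          ("a2", "a"), ("b2", "b")}
cousin, cousin_t, closer = run(parent)
print(sorted(closer))
```

In this family the first cousins `a` and `b` are derived one step before the second cousins `a2` and `b2`, so only tuples pairing the former with the latter end up in closerCousins.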


Corollary 14 For any domain independent query $\phi$ from $TL_\exists$ and a safe Datalog¬ program $P$ there is a predicate $Q$ and a safe Datalog¬ program $P'$ such that $\phi^*_P(\mathbf{x}) = K^{P'}_0(\Diamond Q(\mathbf{x}))$, where (as was defined in Section 2) $K^{P'}_0$ is a temporal structure defined by program $P'$ taken at present time.

Proof: Follows from Theorem 12 and Proposition 1. ∎

This corollary proves the collapse of the domain independent existential fragment of temporal logic, but only for temporal structures determined by Datalog¬ programs with inflationary semantics. Specifically, any domain independent $TL_\exists$ formula can be reduced to a simple formula $\Diamond Q$ (but for a different program).

The next proposition says, among other things, that not all temporal logic queries can be obtained with Datalog queries over safe Datalog¬ programs. Therefore, temporal logic queries have, generally, more expressive power than Datalog queries over safe Datalog¬ programs with inflationary semantics.

Proposition 15 For any domain dependent query $\phi$ and a safe program $P$ there is no safe program $P'$ and a query $Q$ such that $Q$ and $\phi^*_P$ define the same mapping.

Proof: Follows from the fact that safe Datalog¬ programs can produce only symbols in $dom_{P,\phi}$, whereas domain dependent queries return elements outside of this domain. ∎

4.2.4      Main  Theorem  for  the  Domain  Dependent  Case 

We now show how to answer queries in $TL_\exists$ that are not domain independent. Strictly speaking, this cannot be done using safe Datalog¬ programs, as those programs can only compute finite predicates, whereas answers to domain dependent queries can, in general, be infinite. However, we can provide a simple metaprocedure to do so using safe Datalog¬ programs.

Let $\phi$ be an arbitrary query in $TL_\exists$. Let $\omega$ be a constant not in $DOM_P$, and let $Q$ be a unary predicate not in $P$. Let $DOM_{P'} = DOM_P \cup \{\omega\}$. Add a new fact $Q(\omega)$ to $P$, obtaining $P'$. The purpose of this rule is to enlarge the domain of the original program, allowing the utilization of the new constant $\omega$. Note that $dom_{P',\phi} = dom_{P,\phi} \cup \{\omega\}$. As in the proof of Lemma 11, we can compute $\phi^*_{P'} \land dom_{P',\phi}$ using safe Datalog¬ rules.

Let $\mathbf{x} = (x_1, x_2, \ldots, x_m)$, where $x_i$, $i = 1, 2, \ldots, m$, range over the whole domain $DOM_P$, and define $\mathbf{x}[\omega]$ as $(x'_1, x'_2, \ldots, x'_m)$, where for each $i$, $x'_i = x_i$ if $x_i \in dom_{P',\phi}$, and $x'_i = \omega$ if $x_i \notin dom_{P',\phi}$.

Clearly, for all $\mathbf{x}$ over $DOM_P$, $\phi^*_P(\mathbf{x}) = \phi^*_{P'}(\mathbf{x})$. However, one can also show that for each $\mathbf{x}$, $\phi^*_{P'}(\mathbf{x})$ is true if and only if $\phi^*_{P'}(\mathbf{x}[\omega])$ is true. In other words, if some component $x_i$ of $\mathbf{x}$ lies outside $dom_{P,\phi}$, it does not matter for the truth-value of $\phi^*_{P'}$ what the actual value of $x_i$ is (as long as it is outside $dom_{P,\phi}$). Therefore, to determine whether $\phi^*_{P'}(\mathbf{x})$ is true, it is enough to determine whether $\phi^*_{P'}(\mathbf{x}[\omega])$ is true. But $\mathbf{x}[\omega]$ ranges over $dom_{P',\phi}^{|\mathbf{x}|}$. Therefore, it is enough to compute $\phi^*_{P'}(\mathbf{x})$ on $dom_{P',\phi}^{|\mathbf{x}|}$. It follows from Lemma 11 that $\phi^*_{P'}(\mathbf{x}) \land dom_{P',\phi}$ is computable with safe Datalog¬ rules. Therefore, we proved the following theorem.

Theorem 16 For any safe Datalog¬ program $P$ over some domain $DOM_P$ and any temporal logic query $\phi$ in $TL_\exists$, there exists a safe Datalog¬ program $P''$ over $DOM_P \cup \{\omega\}$, where $\omega \notin DOM_P$, and a Datalog query $Q$ such that for any $\mathbf{x}$ over $DOM_P$, $\phi^*_P(\mathbf{x})$ if and only if $Q_{P''}(\mathbf{x}[\omega])$.

Note that the replacement of $\mathbf{x}$ by $\mathbf{x}[\omega]$ cannot be done by safe Datalog¬ rules, and, therefore, we partially "step outside of" our formalism.
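Since the substitution is performed outside the Datalog formalism, it can be pictured as a small host-language function. The sketch below is only an illustration: the name `substitute_omega` and the string `"omega"` standing for the fresh constant are assumptions, and `dom` is expected to be the enlarged domain that already contains the fresh constant.

```python
def substitute_omega(x, dom, omega="omega"):
    """Replace every component of the tuple x that lies outside dom by omega."""
    return tuple(xi if xi in dom else omega for xi in x)

# Hypothetical active domain {a, b} enlarged by the fresh constant.
dom = {"a", "b", "omega"}
print(substitute_omega(("a", "zzz", "b"), dom))
```

A query over an arbitrary tuple is then answered by looking up the substituted tuple in the finite relation computed by the safe program.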

5      Conclusions 

In this paper, we proposed a query language for dynamic databases. The syntax of this language is based on the future-related fragment of temporal logic and its semantics on Datalog programs or their extensions (e.g. Datalog¬ with inflationary semantics). We compared this language with ordinary Datalog queries in terms of expressive power. It was shown that temporal logic has strictly more expressive power than Datalog queries for Datalog and Datalog¬ programs with inflationary semantics. However, for Datalog¬ programs with inflationary semantics we have proven the surprising result that the domain independent existential fragment of temporal logic has the same expressive power as Datalog queries.* This result implies the collapse of the domain independent existential fragment of temporal logic for Datalog¬ programs with inflationary semantics.

Temporal logic as a query language has some other important advantages over Datalog queries besides more expressive power. First, it is more general than Datalog queries because temporal logic queries can be asked on programs that have no fixpoints, whereas Datalog queries depend on the existence of a canonical fixpoint, and because it can be used with other formalisms besides Datalog. In other words, temporal logic can be used in contexts where Datalog queries have no meaning. Second, as we have argued, temporal logic is generally easier to use than Datalog queries in the cases when temporal logic queries can be expressed with Datalog queries: some temporal logic queries can be expressed only in a very complicated way in Datalog.

There has been substantial research conducted recently on integrating production systems with databases. The special issue of the SIGMOD Record [SIGMOD89] provides an overview of this work. Also, [AbViSQ] studied an extension of Datalog¬ with negations allowed both in the head and in the body of a rule. Clearly, these two formalisms are isomorphic. In both of these approaches, Datalog queries do not make sense in general because a fixpoint of a program may not exist. However, temporal logic queries on these extensions are well-defined, and, we believe, more natural. In [KeTu89], we proposed a general approach to defining dynamic databases based on Relational Discrete Event Systems and Models (RDESes and RDEMs). All the formalisms mentioned in this paper constitute examples of RDEMs. It turns out that any RDEM, and not only Datalog and Datalog¬, can be used as a semantic basis for temporal logic queries. This makes temporal logic a powerful approach for defining queries on dynamic databases.

* Some of our results can be generalized to formulae outside the existential fragment.